Use of Linked Data principles for semantic management of scanned documents / Emprego dos princípios Linked Data para gestão semântica de documentos digitalizados
Abstract
The study addresses the use of the Semantic Web and Linked Data principles proposed by the World Wide Web Consortium for the development of a Web application for the semantic management of scanned documents. The main goal is to record scanned documents and describe them in a way that machines can understand and process, so that content can be filtered and users assisted in searching for such documents while a decision-making process is under way. To this end, machine-understandable metadata, created using reference Linked Data ontologies, are associated with the documents, creating a knowledge base. To further enrich the process, a (semi)automatic mashup of these metadata with data from the new Web of Linked Data is carried out. This considerably increases the scope of the knowledge base and makes it possible to extract new data related to the content of the stored documents from the Web and combine them, without the user making any effort or perceiving the complexity of the whole process.
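As a rough illustration of what "machine-understandable metadata associated with a document" can look like, the sketch below describes one scanned document as RDF triples and serializes them as N-Triples. It is not the application described in the study. Dublin Core Terms is assumed here as one possible reference vocabulary, and the document URI, the field values, the DBpedia subject URI, and the helper names (`describe_document`, `to_ntriples`, `documents_about`) are all hypothetical.

```python
from typing import List, Tuple

# Dublin Core Terms namespace (assumed vocabulary; the study does not name one).
DCTERMS = "http://purl.org/dc/terms/"

Triple = Tuple[str, str, str]


def uri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Render a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def describe_document(
    doc_uri: str, title: str, created: str, subject_uri: str
) -> List[Triple]:
    """Build the metadata triples for one scanned document (hypothetical helper)."""
    s = uri(doc_uri)
    return [
        (s, uri(DCTERMS + "title"), literal(title)),
        (s, uri(DCTERMS + "created"), literal(created)),
        # Pointing the subject at an external URI is what would let a later
        # mashup step pull related data from the Web of Linked Data.
        (s, uri(DCTERMS + "subject"), uri(subject_uri)),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def documents_about(triples: List[Triple], subject_uri: str) -> List[str]:
    """Return the documents whose dcterms:subject is the given URI."""
    wanted = (uri(DCTERMS + "subject"), uri(subject_uri))
    return [s for s, p, o in triples if (p, o) == wanted]


# Illustrative knowledge base containing a single document.
kb = describe_document(
    "http://example.org/docs/42",           # hypothetical document URI
    "Scanned invoice",                      # hypothetical title
    "2016-01-15",                           # hypothetical creation date
    "http://dbpedia.org/resource/Invoice",  # illustrative external subject URI
)
```

Calling `to_ntriples(kb)` yields three statements, and `documents_about(kb, "http://dbpedia.org/resource/Invoice")` filters the knowledge base by subject. The mashup with external Linked Data sources is not shown; a real system would more likely use an RDF library and a triple store than hand-built tuples.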
Similar resources
Interoperabilidade e Portabilidade de Documentos Digitais Usando Ontologias
Our purpose is to enable interoperability of documents and achieve portability of digital documents through the reuse of content and format in different plausible combinations. We propose the characterization of digital documents using ontologies as a solution to the problem of lack of interoperability in the implementations of document formats. As proof of concept we consider the portability b...
Joint semantic discourse models for automatic multi-document summarization
Automatic multi-document summarization aims at selecting the essential content of related documents and presenting it in a summary. In this paper, we propose some methods for automatic summarization based on Rhetorical Structure Theory and Cross-document Structure Theory. They are chosen in order to properly address the relevance of information, multidocument phenomena and subtopical distributi...
Normalização Textual e Indexação Semântica Aplicadas na Filtragem de SMS Spam
Abstract: In recent years, the popularization of cell phones and smartphones has driven the use of SMS as an alternative, inexpensive form of communication. The growing number of users of the service, combined with the high trust that users place in this type of message, has been attracting the attention of malicious people and companies, known as spammers. Spam in this context represents a problem for the ...
Un Sistema de Extracción de Información Basado en Ontologías para Documentos en el Dominio de las Tecnologías de Información / An Ontology-Based Information Extractor for Data-Rich Documents in the Information Technology Domain
This paper presents an information extraction method, suitable for data-rich documents, based on the knowledge represented in a domain ontology. The extractor combines a fuzzy string matcher and a word sense disambiguation (WSD) algorithm. The fuzzy string matcher finds mentions of terms combining character-level and token-level similarity measures dealing with non-standardized acronyms and inc...
Processamento de consultas na Web de Dados: uma abordagem para busca de fontes de dados relevantes
The adoption of Linked Data principles has contributed towards the creation of a Web of Data, allowing the development of applications and tools which run queries over available information. One of the main challenges for the query processing over the Web is the selection of relevant sources, i.e., sources which could contribute significantly to the result of a query. In this paper, we discuss ...
Publication date: 2016